Abstract: We present in-vivo 3D human vocal fold images obtained with
polarization sensitive optical coherence tomography (PS-OCT).
Characterizing the extent and location of vocal fold lesions provides useful 
information in guiding surgeons during phonomicrosurgery. Previous
studies showed that PS-OCT imaging can distinguish vocal fold lesions
from normal tissue, but these studies were limited to 2D cross-sectional 
imaging and were susceptible to sampling error. In-vivo 3D endoscopic 
imaging was performed by using a recently developed 2-axis MEMS 
scanning catheter and a spectral domain OCT (SD-OCT), running at 18.5 
frames/s. Imaging was performed in the operating room with patients under 
general anesthesia and 3D images were acquired either by 2D scanning of 
the scanner on the sites of interest or by combining 1D scanning and manual
sliding to capture the whole length of the vocal fold. Vocal fold scar, polyps,
nodules, papilloma and malignant lesions were imaged and characteristics 
of individual lesions were analyzed in terms of spatial distribution and 
variation of tissue structure and birefringence. The 3D large sectional PS- 
OCT imaging showed that the spatial extent of vocal fold lesions can be 
found non-invasively with good contrast from normal tissue. 

1. Introduction

Optical coherence tomography (OCT) is a non-invasive high-resolution imaging technique 
based on light back-reflection from within tissues [1]. OCT provides information on the sub- 
surface micro-structures of tissues to a few mm in depth, and has been found to be useful in 
clinical diagnosis by detecting structural changes non-invasively. Polarization sensitive OCT 
(PS-OCT) is an extension of OCT which measures both the intensity and polarization state of
back-reflected light and provides additional information about tissue polarization properties such as
birefringence [2]. Many tissue types are birefringent (muscle, arteries, tendons,
and the dermis of skin); type I collagen, contained in the deeper layers of vocal folds, is also
birefringent and is therefore imaged well with PS-OCT. Combining information from
conventional OCT images and PS-OCT provides much better contrast to distinguish tissue
layers, and to distinguish normal tissue from lesions, than conventional OCT alone. PS-OCT has been
applied to various pre-clinical and clinical studies: examples are external organs such as the
eye [3-7] and skin [8-11], as well as internal organs such as the coronary artery [12,13] and 
vocal fold [14,15]. Internal organ imaging became possible with the development of 
miniaturized scanning catheters. 

Vocal folds are voice generating organs, composed of two folds of musculomembranous 
tissue, located in the middle of the airway between the lungs and the mouth. Vocal folds
consist of a layered microstructure of the epithelium, lamina propria, and muscle. The lamina 
propria is comprised of superficial, medium and deep layers with increasing collagen content 
toward the deeper layers. The most important vibrating layers for vocalization are the 
superficial lamina propria (SLP) and epithelium. Most benign vocal fold lesions are confined
superficially at the interface of the epithelium and SLP, whereas invasion through the
epithelial basement membrane defines malignant lesions. Characterizing the extent and location
of vocal fold lesions provides useful information in guiding surgeons during phonomicrosurgery.
Since PS-OCT images sub-surface vocal fold tissue non-invasively, it may be useful for 
characterizing vocal fold lesions. Prior work has shown that PS-OCT images the interface 
between epithelium and SLP and detects collagen content within the layers of lamina propria 
[14]. In vivo human vocal fold imaging was previously performed with a conventional 1D
scanning catheter, and PS-OCT distinguished vocal fold pathologies from normal
tissue [15,16]. Although these studies demonstrated the feasibility of PS-OCT imaging in
vocal folds, imaging was limited to slow (1 frame/sec) sampling of 2D cross-sectional images 
on select areas and provided only a small number of cross-sectional images. Therefore, pathology
characteristics could be missed with this approach.

Recent advances in OCT technology have raised imaging speed to more than 100K
depth-scans/s, making real-time imaging and large sectional imaging possible. For
internal organ imaging, the speed of the scanning catheter is another limiting factor. High- 
speed scanning catheters were developed for tubular organs, and comprehensive imaging 
covering large sections was demonstrated [17,18]. Other high-speed scanning catheters have 
been developed for non-tubular organ imaging [19-21]. We applied a recently developed 2- 
axis MEMS scanning catheter to the vocal fold study to overcome limitations of the previous 
studies such as imaging speed and area [21]. We could image vocal folds in 3D at the speed of 
18.5 frames/s by using the MEMS scanning catheter and a spectral domain OCT (SD-OCT) 
[22]. The new system was applied to in-vivo imaging of various vocal fold pathologies such
as nodule, polyp, scar, papilloma, and cancer. PS-OCT images and 3D volumes of individual
pathologies are presented.

2. Methods 

Endoscopic OCT imaging was performed by using a 2-axis MEMS scanning catheter in 
conjunction with a multi-functional spectral domain OCT (SD-OCT) system. Details of the 
MEMS scanning catheter and the OCT system can be found in the literature [21,22]. In short, 
the MEMS catheter had a miniaturized scanner, fabricated with MEMS technology, at its
tip. The scanner could deflect about two orthogonal axes and was driven magnetically by applying
voltage waveforms to wound-coil electromagnets, one for each axis. The MEMS
scanner was driven with sinusoidal and linear waveforms for the fast and slow axes,
respectively. Image distortion in the fast axis due to the sinusoidal waveform was corrected
with subsequent image processing. The catheter body was 2.7 mm in diameter and 12 mm in 
rigid length. In actual imaging, the catheter was sealed with a disposable clear plastic sheath 
(PEBAX, Innovative Medical Design, Amherst, NH) for protection, and the size became 
approximately 3 mm in diameter. The scanning range was more than +/- 28° in optical angle, 
which corresponded to more than 1.5 mm on the surface of the catheter. The multifunctional 
SD-OCT system could perform intensity and polarization sensitive (PS) imaging
simultaneously at a speed of 18.5K depth-scans/s (18.5 frames/s at 1000 depth-scans/frame), and
its imaging depth was 1.5 mm in tissue, assuming a tissue refractive index of 1.4. PS images
displayed, in gray scale, the phase retardation between the fast and slow axes of the sample
accumulated from the surface.
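
The fast-axis distortion correction mentioned above can be illustrated with a short sketch. The actual processing used with this catheter is not described here, so the following is only one plausible approach under stated assumptions: depth-scans are acquired at equal time intervals over one monotonic half-period of the sinusoidal sweep, and each depth row is resampled onto a uniform lateral grid by linear interpolation. The function name `linearize_sinusoidal_scan` and its parameters are hypothetical.

```python
import numpy as np


def linearize_sinusoidal_scan(frame, n_out=None):
    """Resample a cross-sectional image acquired with a sinusoidal sweep.

    frame : 2D array, shape (depth, n_alines). Depth-scans are assumed to be
            equally spaced in TIME over one half-period of the sweep, so the
            beam moves from one extreme (-1) to the other (+1).
    n_out : number of depth-scans in the corrected image (default: n_alines).

    Returns an array of shape (depth, n_out) sampled uniformly in POSITION.
    """
    frame = np.asarray(frame, dtype=float)
    depth, n_in = frame.shape
    n_out = n_in if n_out is None else n_out

    # Normalized beam position of each acquired depth-scan: x = sin(phase).
    phase = np.linspace(-np.pi / 2, np.pi / 2, n_in)
    x_acquired = np.sin(phase)

    # Target positions: equally spaced across the same lateral range.
    x_uniform = np.linspace(-1.0, 1.0, n_out)

    corrected = np.empty((depth, n_out), dtype=float)
    for z in range(depth):
        # Linear interpolation of one depth row along the lateral axis.
        corrected[z] = np.interp(x_uniform, x_acquired, frame[z])
    return corrected
```

Because the beam dwells longer near the turning points of the sweep, the uncorrected image is oversampled at its edges and undersampled at its center; the resampling trades that uneven sampling for a geometrically undistorted image.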

The procedural setup for in-vivo imaging has been previously described in the literature
[15]. In short, PS-OCT imaging was performed in the operating room under general 
anesthesia. The study protocol was approved by the institutional review board (IRB) of 
Massachusetts General Hospital. The MEMS scanning catheter, which was sealed with the 
disposable plastic sheath, was introduced transorally through a standard laryngeal suction 
catheter. 3D images were obtained with the catheter in contact with the vocal fold and lesion. 
First, individual sections of interest were imaged in 3D with 2D scanning of the MEMS 
scanner while the catheter was held still. The imaged area was approximately 1.5 mm x 1.5
mm on the surface. Each section took approximately 5 seconds to acquire, comprising
100 cross-sectional images of 1000 depth-scans each. Next, in order to
image a large tissue area, the scanning catheter was advanced posteriorly to anteriorly along 
the vocal fold edge in the transverse plane, with the high speed MEMS scanner axis oriented 
in the orthogonal coronal plane. The full length of vocal fold edges, which was slightly more 
than 10 mm, could be imaged with this method. The acquisition time was approximately 15-20
seconds. During the acquisition, images were processed and displayed in real time on the
screen of an acquisition computer. The whole imaging time of each session was less than 5 
minutes. 
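
The acquisition figures quoted above follow from the frame rate, and a small calculation makes the relationship explicit. The sketch below uses only numbers stated in the text (18.5 frames/s, 1000 depth-scans/frame, 100 frames per stationary volume, roughly 10 mm covered in 15-20 seconds of sliding). The frame spacing during manual sliding additionally assumes a constant sliding speed, which is an idealization; the function names are hypothetical.

```python
def volume_time_s(n_frames, frame_rate_hz):
    """Time needed to acquire n_frames cross-sectional images."""
    return n_frames / frame_rate_hz


def aline_rate_hz(frame_rate_hz, alines_per_frame):
    """Depth-scan rate implied by the frame rate and frame size."""
    return frame_rate_hz * alines_per_frame


def frame_spacing_um(length_mm, duration_s, frame_rate_hz):
    """Mean distance between frames when length_mm is covered in duration_s.

    Assumes a constant speed along the slow (or sliding) direction.
    """
    n_frames = duration_s * frame_rate_hz
    return 1000.0 * length_mm / n_frames


if __name__ == "__main__":
    frame_rate = 18.5  # frames/s, from the text

    # Stationary 3D section: 100 frames of 1000 depth-scans each.
    print("volume time (s):", volume_time_s(100, frame_rate))
    print("depth-scans/s:", aline_rate_hz(frame_rate, 1000))

    # Manual sliding over about 10 mm in 15 to 20 seconds.
    for duration in (15.0, 20.0):
        spacing = frame_spacing_um(10.0, duration, frame_rate)
        print(f"sliding {duration:.0f} s: about {spacing:.0f} um between frames")
```

With these inputs a 100-frame volume takes about 5.4 seconds, consistent with the approximately 5 seconds quoted, and the mean frame spacing during sliding is a few tens of micrometers, coarser than the 15 um spacing of a stationary 1.5 mm section.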

3. Results 

Forty-four vocal folds in 22 patients were imaged with the MEMS scanning catheter. Benign vocal
fold lesions included scar tissue (N = 2), polyp (N = 2), nodule (N = 2), papilloma (N = 4) and 
other rare benign lesions (N = 5). In addition, seven patients with malignant vocal fold lesions 
(N = 7) were imaged. When clinically indicated, biopsy or complete removal of the lesion 
established histological diagnosis. Representative large sectional 3D PS-OCT images of 
individual lesions are presented in Figs. 1-10. Video clips of the reconstructed 3D vocal fold
images, showing cross-sectional surfaces progressing in two orthogonal directions, are
presented along with an intra-operative wide-field image of the vocal fold lesion.

3.1 Polyp 

The wide field image and 3D reconstructed PS-OCT image of a right vocal fold polyp are 
shown in Fig. 1, and a video clip of the 3D PS-OCT image is shown in Fig. 2. The PS-OCT 
image shows 3D intensity and PS images side by side, and three cross-sectional images at 
selected regions. The video clip shows internal structures of the vocal fold with moving cross- 
sections. In the wide field image, the polyp appears on the mid-portion of the right vocal fold 
as a local swelling. In the PS-OCT image, the polyp appears as a small surface irregularity
[Fig. 1(b), intensity image], because most of the polyp was compressed by the catheter during
manual scanning. The polyp appears as a fluid-filled sac covered with an outer tissue layer and
has little or no banding pattern (birefringence), as shown in the cross-sectional image of the polyp
section in Fig. 1(d).

Fig. 1. Wide field image (a), 3D PS-OCT image of polyp (b, intensity: left, PS: right), and
cross-sectional images (c-e). 




Fig. 2. (Media 1) Video clip of 3D PS-OCT image of polyp. 

3.2 Nodule 

Wide-field, 3D reconstructed PS-OCT still images of a vocal fold nodule and a 3D PS-OCT 
video clip are shown in Figs. 3 and 4 respectively. The nodules appear in the mid-portion of 
the vocal folds bilaterally as symmetric swellings in the wide-field image [Fig. 3(a)]. The PS- 
OCT image of the left vocal fold shows irregular tissue structures in the nodule region [arrows 
in Fig. 3(b), intensity image], and the boundary between the epithelium and SLP is not clear. 
The irregular structures could be either the thickened epithelium or loose fibrous structure in 
the SLP. The nodule shows a relatively weak banding pattern, especially in the section 
showing loose tissue structures in the PS image. 

Fig. 3. Wide-field image (a) and 3D PS-OCT image of nodule (b, intensity: right, PS: left), and
cross-sectional images (c-e). 




Fig. 4. (Media 2) Video clip of the 3D PS-OCT image of nodule. 

3.3 Papilloma 

Wide-field, 3D reconstructed PS-OCT still images of vocal fold papilloma and a 3D PS-OCT 
video clip are shown in Figs. 5 and 6, respectively. The papilloma appears in the wide-field
image as an exophytic fibrovascular growth on the right anterior vocal fold [labeled P in
Fig. 5(a)]. The PS-OCT image of the right side also shows a clear boundary between normal
tissue posteriorly and papilloma anteriorly [arrow in Fig. 5(b), intensity image]. The normal 
tissue posteriorly shows layered tissue structures with the basement membrane in the intensity 
image [Fig. 5(e)], and a horizontal black and white banding pattern in the PS image [bracket 
labeled H in Fig. 5(b), PS image]. However, the papilloma anteriorly shows homogeneous 
vertically aligned structures rather than layered tissue structure in the intensity image 
[Fig. 5(c)]. Also, the black-white banding pattern of the PS image of papilloma is irregular
and aligned in the vertical direction rather than in the horizontal [bracket labeled V in 
Fig. 5(b), PS image]. The boundary between normal tissue and papilloma is clear in the PS- 
OCT image [* at boundary between brackets of Fig. 5(b), PS image]. 




Fig. 5. Wide-field image (a) and 3D PS-OCT image of papilloma (b, intensity: left, PS: right), 
and cross-sectional images (c-e). 




Fig. 6. (Media 3) Video clip of 3D PS-OCT image of papilloma. 



3.4 Cancer and carcinoma in situ

The images from two representative cancer cases are presented here. The first case is a deeply 
invasive cancer which extends beyond the 1.5 mm imaging depth of the current PS-OCT 
system. Wide-field, 3D reconstructed PS-OCT still images of the thick cancer and a 3D PS- 
OCT video clip for this case are shown in Figs. 7 and 8 respectively. In the wide-field image, 
the cancer is seen bilaterally as a combination of raised irregular white and red areas. The 
intensity image shows homogeneous structures posteriorly and layered structures anteriorly. 
In addition, the intensity image posteriorly turns white quickly with depth due to fast
signal decay, compared to the images obtained more anteriorly. Cancer tissue is usually
highly scattering, so this fast decay suggests deeper cancer invasion posteriorly. In the PS
image, banding patterns are nearly absent or irregular posteriorly, but clear horizontal
banding patterns (indicative of normal vocal fold microstructure) are seen anteriorly. Three
cross-sectional images show the transition from cancer to normal tissue as the cross-section 
goes from posterior to anterior [Fig. 7(e): cancer, Fig. 7(d): transition, Fig. 7(c): normal]. This 
3D PS-OCT image shows a clear distinction between the cancer side (posterior) and normal 
side (anterior) in both the intensity and PS images, but the PS image shows the difference with 
better contrast [* marks this boundary in Fig. 7(b), PS image]. 

Fig. 7. Wide-field image (a) and 3D PS-OCT image of a deeply invasive cancer case (b,
intensity: right, PS: left), and cross-sectional images (c-e). 




Fig. 8. (Media 4) Video clip of 3D PS-OCT image of the deeply invasive cancer case. 

The second case is carcinoma in situ originating on the left vocal fold as shown in the
wide-field image (Fig. 9). These lesions contain malignant cells but are confined to the 
epithelium with no invasion through the epithelial basement membrane. The 3D PS-OCT 
image of the left vocal fold in Fig. 9 shows the lesion confined superficially within the 
epithelium in both the intensity [arrows in Fig. 9(b)] and PS images [bracket showing loss of 
banding pattern in Fig. 9(b)], and the corresponding video clip (Fig. 10) shows internal 
structures in various cross-sections. The intensity image shows some boundary structures 
between cancerous and normal tissue [Figs. 9(d), 9(e)]. The PS image shows little horizontal
banding pattern in the superficial layer, and a white band in the lower layer indicates
birefringence [Fig. 9(d)]. There is a clear boundary between the carcinoma in situ lesion and
normal tissue.

Fig. 9. Wide-field image (a) and 3D PS-OCT image of a carcinoma in situ case (b, intensity:
right, PS: left), and cross-sectional images (c-e). 




Fig. 10. (Media 5) Video clip of 3D PS-OCT image of the carcinoma in situ case.

4. Discussion 

Various human vocal fold lesions were imaged in 3D with the MEMS scanning catheter and 
spectral domain PS-OCT. Even though the MEMS scanner used in this study had a very large 
scanning angle of 46°, the scanning area was limited to 1.5 mm x 1.5 mm on the surface, 
which covered only a small section of vocal fold. To overcome this limitation, we advanced 
the probe posterior to anterior over the vocal fold to image its entire length. This large 
sectional 3D imaging provided information on the spatial distribution and depth of penetration 
of lesions on vocal folds. Individual lesions were analyzed based on tissue structure and 
birefringence. Most of the vocal fold lesions showed loss of tissue birefringence characteristic 
of a normal SLP, because lesion microstructure tends to be disorganized. Therefore, the PS
images provided a good additional contrast mechanism to distinguish lesions from
normal tissue. Large sectional PS-OCT imaging was very effective at showing the spatial
distribution of vocal fold lesions non-invasively.

Since most vocal fold lesions showed loss of birefringence and tissue structure, PS-OCT
could not differentiate among different lesions and is not suitable to be used alone.
PS-OCT needs to be used in combination with wide-field imaging both to identify vocal
fold lesions and to determine their spatial extent.

Although the manual sliding method of advancing the imaging probe from posterior to
anterior could extend imaging range to the whole length, there were technical difficulties. 
Since imaging occurs by placing the imaging probe in contact with the tissue, the vocal fold 
surface could be adequately positioned to allow imaging of the superior and inferior surfaces 
of these lesions. During imaging, perturbation occurred due to surface irregularity of the 
lesions and variability in speed during catheter sliding over the vocal fold. Such artifacts could 
be found in reconstructed 3D images. In principle, these artifacts could be compensated with 
image processing such as cross-correlation. Although this manual sliding method could
extend the screening area significantly in length, it could not cover the entire section
of the vibrating vocal fold due to the narrow width coverage of 1.5 mm. Therefore, there should
be some way to increase the scanning area in both width and length. Given that the size and
the scanning angle of the catheter are limited practically, one improvement could be to create 
a greater distance between the scanning catheter and the tissue to be imaged. Then a larger 
area could be imaged with the same scanning angle. However, this would require a larger 
diameter of the catheter. Finally, lesions are typically thicker than the 1.5 mm imaging depth 
range of the OCT. Therefore, OCT imaging may not be able to show the depth of extension of 
thicker and more exophytic lesions. This is a fundamental problem faced by OCT technology 
in studying vocal folds. However, PS-OCT imaging may still be useful post-operatively for
lesion surveillance since the exophytic part of the lesion would have been excised during 
surgery. 
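
As an illustration of the cross-correlation compensation suggested above, the following sketch aligns consecutive frames of a volume along the depth axis. No such correction was applied in this study, so this is only an assumed, simplified scheme: it estimates a whole-pixel axial shift between neighboring frames from their laterally averaged depth profiles and undoes it with a circular shift. It does not address uneven frame spacing caused by variable sliding speed. The names `axial_shift` and `align_frames_axially` are hypothetical.

```python
import numpy as np


def axial_shift(ref_profile, profile):
    """Whole-pixel lag that best aligns `profile` to `ref_profile`.

    Rolling `profile` by the returned lag maximizes its cross-correlation
    with `ref_profile`.
    """
    a = np.asarray(ref_profile, dtype=float)
    b = np.asarray(profile, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    xcorr = np.correlate(a, b, mode="full")
    # In "full" mode, index len(b) - 1 corresponds to zero lag.
    return int(np.argmax(xcorr)) - (len(b) - 1)


def align_frames_axially(volume):
    """Align each frame to the previously aligned frame along depth.

    volume : array of shape (n_frames, depth, width).
    Returns a float copy with per-frame circular shifts applied in depth.
    """
    aligned = np.array(volume, dtype=float, copy=True)
    for i in range(1, aligned.shape[0]):
        ref = aligned[i - 1].mean(axis=1)  # mean depth profile, previous frame
        cur = aligned[i].mean(axis=1)      # mean depth profile, current frame
        lag = axial_shift(ref, cur)
        aligned[i] = np.roll(aligned[i], lag, axis=0)
    return aligned
```

Aligning each frame to its already-corrected neighbor keeps the surface continuous along the sliding direction, at the cost of letting small estimation errors accumulate over a long scan; registering to a fixed reference frame is an alternative when the tissue surface changes little.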

5. Conclusions 

High-quality images of various benign and malignant human vocal fold lesions were obtained
using 3D PS-OCT with the 2-axis MEMS scanning catheter and SD-OCT. Real-time 3D
imaging helped to characterize individual vocal fold lesions by providing their spatial 
distribution and variation, and demarcated the boundary between lesions and normal tissue. 
The full length of the vocal folds could be imaged by manually sliding the MEMS scanning 
catheter along the length of the vocal fold. These large sectional PS-OCT images showed the 
spatial distribution of lesions on the vocal fold in-vivo. 3D PS-OCT vocal fold imaging has
potentially useful clinical applications in imaging the layered microstructure of vocal folds and
may serve as a useful adjunct in phonomicrosurgery.